ABSTRACT
Evaluating the performance of large compute clusters requires benchmarks with representative workloads. At Google, performance benchmarks are used to obtain metrics such as task scheduling delay and machine resource utilization in order to assess changes in application code, machine configurations, and scheduling algorithms. Existing approaches to workload characterization for high performance computing and grids focus on tasks' resource requirements for CPU, memory, disk, I/O, network, etc. Such requirements address how much of each resource a task consumes. However, in addition to resource requirements, Google workloads commonly include task placement constraints that determine which machines' resources a task may consume. Placement constraints arise from task dependencies on machine properties such as hardware architecture and kernel version.
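To make the distinction concrete, here is a minimal sketch (ours, not Google's scheduler; the attribute names and values are hypothetical) of a placement constraint acting as a predicate that filters which machines are eligible, independently of how much resource the task requests:

```python
# Hypothetical sketch: a placement constraint as a predicate over machine
# properties. Attribute names ("arch", "kernel", ...) are illustrative only.

machines = [
    {"id": "m1", "arch": "x86_64", "kernel": "2.6.32", "num_cores": 8},
    {"id": "m2", "arch": "arm64",  "kernel": "2.6.18", "num_cores": 4},
]

def satisfies(machine, constraint):
    # A task may be placed on a machine only if every constrained
    # attribute matches; resource requirements are checked separately.
    return all(machine.get(attr) == value for attr, value in constraint.items())

constraint = {"arch": "x86_64", "kernel": "2.6.32"}
eligible = [m["id"] for m in machines if satisfies(m, constraint)]
print(eligible)  # ['m1'] -- the constraint shrinks the candidate pool
```

The point of the sketch is that a constraint shrinks the pool of candidate machines before any resource accounting happens, which is exactly where the extra scheduling delay comes from.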
This paper develops methodologies for incorporating task placement constraints and machine properties into performance benchmarks of large compute clusters. Our studies of Google compute clusters show that constraints increase average task scheduling delays by a factor of 2 to 6, which often results in tens of minutes of additional task wait time. To understand why, we extend the concept of resource utilization to include constraints by introducing a new metric, the Utilization Multiplier (UM). UM is the ratio of the resource utilization seen by tasks with a constraint to the average utilization of that resource across the cluster. UM provides a simple model of the performance impact of constraints: task scheduling delays increase with UM. Finally, we describe how to synthesize representative task constraints and machine properties, and how to incorporate this synthesis into existing performance benchmarks. Using synthetic task constraints and machine properties generated by our methodology, we accurately reproduce the performance metrics of benchmarks run on Google compute clusters, with a discrepancy of only 13% in task scheduling delay and 5% in resource utilization.
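Stated as a formula (the notation below is ours, inferred from the definition in the abstract):

$$\mathrm{UM}_c(r) = \frac{u_c(r)}{\bar{u}(r)}$$

where $u_c(r)$ is the utilization of resource $r$ aggregated over the machines whose properties satisfy constraint $c$, and $\bar{u}(r)$ is the average utilization of $r$ over all machines in the cluster. A UM of 1 means the constraint is effectively free; a UM above 1 means constrained tasks compete for a busier pool of machines, which is why their scheduling delays grow with UM.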